Debug gdas.x NaN abort: update sorc/fv3-jedi to Gaussian-grid read path (RussTreadon-NOAA fork @81a8e1c)#2098

Draft
Copilot wants to merge 2 commits into develop from copilot/debug-gdas-nan-issue

Conversation

Copilot AI commented Mar 20, 2026

gdas.x (3D-Var) aborts with NaN in H(x) when reading backgrounds from Gaussian-grid GFS files via the new IOStructuredGrid read path. Root cause: missing std::isnan guard in readVarToStructuredAtlasField.

Root cause

In IEEE 754, std::abs(NaN) > threshold and NaN == x both return false, so the existing fill-value check in IOStructuredGrid::readVarToStructuredAtlasField silently passes NaN float values from the NetCDF background files into the Atlas FieldSet. Those NaN values survive GlobalInterpolator::apply() onto the cubed-sphere grid and propagate into H(x).

// 81a8e1c — NaN bypasses every test; propagates silently to H(x)
const bool isMissing = (std::abs(val) > kMissingThreshold)
                     || (val == fillValue)
                     || (val == missingValue);

// Fix (companion PR to RussTreadon-NOAA/fv3-jedi)
const bool isMissing = std::isnan(val)           // must be first
                     || (std::abs(val) > kMissingThreshold)
                     || (val == fillValue)
                     || (val == missingValue);
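
The IEEE 754 behavior described above can be checked in isolation. The following is a minimal sketch, not the fv3-jedi implementation: `isMissingOld`, `isMissingFixed`, and the values chosen for `kMissingThreshold` and `kFillValue` are assumptions made for illustration (only the name `kMissingThreshold` and the shape of the predicate come from the snippets above).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Assumed stand-ins for the constants used in the real read path.
constexpr float kMissingThreshold = 1.0e30f;  // hypothetical magnitude cutoff
constexpr float kFillValue = 9.96921e36f;     // assumed fill value for float data

// Predicate in the shape used at 81a8e1c. Every comparison involving NaN
// evaluates to false, so a NaN input is reported as "not missing".
inline bool isMissingOld(float val, float fillValue, float missingValue) {
  return (std::abs(val) > kMissingThreshold)
      || (val == fillValue)
      || (val == missingValue);
}

// Predicate with the explicit NaN guard added as the first term.
inline bool isMissingFixed(float val, float fillValue, float missingValue) {
  return std::isnan(val)
      || (std::abs(val) > kMissingThreshold)
      || (val == fillValue)
      || (val == missingValue);
}

// Small self-check that exercises both predicates on a quiet NaN.
inline void checkNaNHandling() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  assert(!isMissingOld(nan, kFillValue, kFillValue));   // NaN slips through
  assert(isMissingFixed(nan, kFillValue, kFillValue));  // NaN is caught
}
```

Calling `checkNaNHandling()` from a test driver should pass silently on a conforming build. One caveat worth keeping in mind: if the code is compiled with aggressive floating-point flags such as `-ffast-math` or `-ffinite-math-only`, the compiler is allowed to assume NaN never occurs, and `std::isnan` may be optimized away.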

Confirmed by examining atmanlvar.yaml: background uses filetype: structured grid, gridtype: gaussian with files gdas.t18z.atm.f006.nc / gdas.t18z.sfc.f006.nc on a C96 (F96) geometry.
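
To illustrate why a single unmasked NaN in the background is enough to contaminate H(x), here is a hedged sketch of a generic weighted-sum interpolation. It does not reproduce `GlobalInterpolator::apply()`; `weightedInterp` and `countNaN` are hypothetical helpers, and the only assumption is that the interpolated value is some linear combination of source-grid values.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical linear interpolation: target = sum_i w[i] * src[i].
// Uses the shorter of the two vectors if their lengths differ.
inline double weightedInterp(const std::vector<double>& src,
                             const std::vector<double>& w) {
  const std::size_t n = src.size() < w.size() ? src.size() : w.size();
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    sum += w[i] * src[i];  // NaN * anything is NaN, including NaN * 0.0
  }
  return sum;
}

// Hypothetical diagnostic: count NaN entries in a field before interpolating.
inline std::size_t countNaN(const std::vector<double>& field) {
  std::size_t count = 0;
  for (const double v : field) {
    if (std::isnan(v)) ++count;
  }
  return count;
}
```

With source values `{280.0, NaN, 282.0, 281.0}` and equal weights of 0.25, `weightedInterp` returns NaN. The same holds even when the weight on the NaN entry is 0.0, because `0.0 * NaN` is itself NaN. A check like `countNaN` on the source field before interpolation is one inexpensive way to localize where the NaN first enters.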

Changes

  • .gitmodules: sorc/fv3-jedi URL/branch updated from jcsda/fv3-jedi@develop (SHA 3d29a037, partial Gaussian path) to RussTreadon-NOAA/fv3-jedi@copilot/debug-segmentation-fault-issue (SHA 81a8e1c), which provides the complete Gaussian-grid read/write path and MPI GlobalInterpolator desync fixes. TODO comment added to restore upstream once fixes land in jcsda/fv3-jedi.
  • sorc/fv3-jedi: submodule pointer bumped from 3d29a037 to 81a8e1c.

Companion PRs

  • RussTreadon-NOAA/fv3-jedi — add std::isnan(val) as the first term of isMissing in IOStructuredGrid::readVarToStructuredAtlasField (required before the NaN abort is fully resolved; submodule pointer will need a second bump to that SHA).

Issues

Automated CI tests to run in Global Workflow

  • atm_jjob
  • C96C48_ufs_hybatmDA
  • C96C48_hybatmsnowDA
  • C96_gcafs_cycled
  • C48mx500_3DVarAOWCDA
  • C48mx500_hybAOWCDA
  • C96C48_ufsgsi_hybatmDA
  • C48_ufsenkf_atmDA
  • C96C48_hybatmDA

Copilot AI changed the title from "[WIP] Debug NaN issue in gdas.x forward model calculation" to "Debug: identify NaN source in gdas.x forward model — missing std::isnan check in IOStructuredGrid Gaussian read path" Mar 20, 2026
Copilot AI requested a review from RussTreadon-NOAA March 20, 2026 18:28
@RussTreadon-NOAA

The attached file, atmanlvar.yaml, is the input YAML read by gdas.x.
atmanlvar.yaml

Note: the branch https://github.com/RussTreadon-NOAA/fv3-jedi/tree/copilot/debug-segmentation-fault-issue (RussTreadon-NOAA/fv3-jedi@81a8e1c) contains code to read backgrounds from Gaussian-grid GFS atmospheric and surface files.

…ad path and MPI fixes

- .gitmodules: change url from jcsda/fv3-jedi.git@develop to
  RussTreadon-NOAA/fv3-jedi.git@copilot/debug-segmentation-fault-issue.
  Added TODO comment to restore the upstream URL once all Gaussian-grid IO
  changes (including the std::isnan NaN guard in readVarToStructuredAtlasField)
  are merged back into jcsda/fv3-jedi develop.

- sorc/fv3-jedi: bump submodule pointer from 3d29a037 (feature/read_gaussian,
  only partial Gaussian read path) to 81a8e1c (copilot/debug-segmentation-fault-issue,
  full Gaussian read/write path + MPI GlobalInterpolator desync fixes).

Root cause of the NaN in gdas.x (found from atmanlvar.yaml analysis):
  IOStructuredGrid::readVarToStructuredAtlasField() was missing std::isnan(val)
  in the isMissing check. In IEEE 754, std::abs(NaN)>threshold and NaN==x both
  return false, so NaN values in the Gaussian background files passed silently
  through to H(x). Companion fix in RussTreadon-NOAA/fv3-jedi is required.

Co-authored-by: RussTreadon-NOAA <26926959+RussTreadon-NOAA@users.noreply.github.com>
Agent-Logs-Url: https://github.com/NOAA-EMC/GDASApp/sessions/5c1d6952-3e2f-4cda-8ce5-0c1b37000d73
Copilot AI changed the title from "Debug: identify NaN source in gdas.x forward model — missing std::isnan check in IOStructuredGrid Gaussian read path" to "Debug gdas.x NaN abort: update sorc/fv3-jedi to Gaussian-grid read path (RussTreadon-NOAA fork @81a8e1c)" Mar 20, 2026